Red-Wine-Quality-EDA by Mohamed Abido



Citation :

This dataset is publicly available for research. The details are described in [Cortez et al., 2009].

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016 [Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf [bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib



Introduction :

  • There are 1599 observations and 13 variables in this data set.
  • Description of the dataset attributes available at link
Structure of Dataset
## 'data.frame':    1599 obs. of  13 variables:
##  $ X                   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ fixed.acidity       : num  7.4 7.8 7.8 11.2 7.4 7.4 7.9 7.3 7.8 7.5 ...
##  $ volatile.acidity    : num  0.7 0.88 0.76 0.28 0.7 0.66 0.6 0.65 0.58 0.5 ...
##  $ citric.acid         : num  0 0 0.04 0.56 0 0 0.06 0 0.02 0.36 ...
##  $ residual.sugar      : num  1.9 2.6 2.3 1.9 1.9 1.8 1.6 1.2 2 6.1 ...
##  $ chlorides           : num  0.076 0.098 0.092 0.075 0.076 0.075 0.069 0.065 0.073 0.071 ...
##  $ free.sulfur.dioxide : num  11 25 15 17 11 13 15 15 9 17 ...
##  $ total.sulfur.dioxide: num  34 67 54 60 34 40 59 21 18 102 ...
##  $ density             : num  0.998 0.997 0.997 0.998 0.998 ...
##  $ pH                  : num  3.51 3.2 3.26 3.16 3.51 3.51 3.3 3.39 3.36 3.35 ...
##  $ sulphates           : num  0.56 0.68 0.65 0.58 0.56 0.56 0.46 0.47 0.57 0.8 ...
##  $ alcohol             : num  9.4 9.8 9.8 9.8 9.4 9.4 9.4 10 9.5 10.5 ...
##  $ quality             : int  5 5 5 6 5 5 5 7 7 5 ...
Adding a wine grade column :
  • Grades :
    • Quality 7 to 8 is rated “A”
    • Quality 5 to 6 is rated “B”
    • Quality 3 to 4 is rated “C”
## 'data.frame':    1599 obs. of  14 variables:
##  $ X                   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ fixed.acidity       : num  7.4 7.8 7.8 11.2 7.4 7.4 7.9 7.3 7.8 7.5 ...
##  $ volatile.acidity    : num  0.7 0.88 0.76 0.28 0.7 0.66 0.6 0.65 0.58 0.5 ...
##  $ citric.acid         : num  0 0 0.04 0.56 0 0 0.06 0 0.02 0.36 ...
##  $ residual.sugar      : num  1.9 2.6 2.3 1.9 1.9 1.8 1.6 1.2 2 6.1 ...
##  $ chlorides           : num  0.076 0.098 0.092 0.075 0.076 0.075 0.069 0.065 0.073 0.071 ...
##  $ free.sulfur.dioxide : num  11 25 15 17 11 13 15 15 9 17 ...
##  $ total.sulfur.dioxide: num  34 67 54 60 34 40 59 21 18 102 ...
##  $ density             : num  0.998 0.997 0.997 0.998 0.998 ...
##  $ pH                  : num  3.51 3.2 3.26 3.16 3.51 3.51 3.3 3.39 3.36 3.35 ...
##  $ sulphates           : num  0.56 0.68 0.65 0.58 0.56 0.56 0.46 0.47 0.57 0.8 ...
##  $ alcohol             : num  9.4 9.8 9.8 9.8 9.4 9.4 9.4 10 9.5 10.5 ...
##  $ quality             : Ord.factor w/ 6 levels "3"<"4"<"5"<"6"<..: 3 3 3 4 3 3 3 5 5 3 ...
##  $ wine.grade          : chr  "B" "B" "B" "B" ...
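
The grading rule can be sketched in a few lines of R. This is a minimal illustration rather than the report's original code: the toy `redwine` data frame holds made-up quality scores, and nested `ifelse()` is only one of several ways to express the mapping.

```r
# Hypothetical toy data; the real data set has 1599 rows.
redwine <- data.frame(quality = c(5, 6, 7, 4, 8, 3))

# Map quality scores to grades: 7-8 -> "A", 5-6 -> "B", 3-4 -> "C".
redwine$wine.grade <- ifelse(redwine$quality >= 7, "A",
                      ifelse(redwine$quality >= 5, "B", "C"))

# Store quality as an ordered factor with all six levels,
# matching the str() output shown in the report.
redwine$quality <- factor(redwine$quality, levels = 3:8, ordered = TRUE)

# Count wines per grade.
grade_counts <- table(redwine$wine.grade)
print(grade_counts)
```

Assigning the grade before converting `quality` to a factor keeps the numeric comparisons (`>= 7`, `>= 5`) straightforward.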

Observation regarding the sample quality:

  • After adding the wine grade column, the sample contains :
    • 217 grade “A” wine.
    • 1319 grade “B” wine.
    • 63 grade “C” wine.
## [1] 217
## [1] 1319
## [1] 63


Univariate Plots Section

Quality

  • As seen below, most of the wine samples have a quality of 5 or 6 (Grade B).
    • Within grade “A”, most wines have quality 7.
    • Within grade “B”, quality 5 is slightly more common than quality 6.
    • Within grade “C”, most wines have quality 4.

Alcohol (% by volume)

  • Alcohol (% by volume): the percentage of alcohol.
    • The distribution is positively skewed, with a mean of 10.42% and a maximum of 14.90%.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    8.40    9.50   10.20   10.42   11.10   14.90

pH

  • pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale.
    • The distribution is approximately normal with a few outliers. It has a mean of 3.31 and a maximum of 4.01.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.740   3.210   3.310   3.311   3.400   4.010

Density (g / cm^3)

  • Density: depends on the alcohol and sugar content.
    • The density of wine is close to that of water, which is about 1 g / cm^3.
    • The density distribution is approximately normal.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.9901  0.9956  0.9968  0.9967  0.9978  1.0037

Sulphates (potassium sulphate - g / dm^3) & Free sulfur dioxide (mg / dm^3)

  • Sulphates: a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant.
    • The distribution is positively skewed with a long tail.
    • It has a mean of 0.66 and a maximum of 2.
  • Free sulfur dioxide : the free form of SO2, which exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine.
    • The distribution is positively skewed.
    • It has a mean of 15.87 and a maximum of 72.

## [1] "Sulphates"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.3300  0.5500  0.6200  0.6581  0.7300  2.0000
## [1] "Free sulfur dioxide"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    7.00   14.00   15.87   21.00   72.00

Total sulfur dioxide (mg / dm^3)

  • Total sulfur dioxide : amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine.
    • As seen below, the distribution is positively skewed and has a long tail.
    • In such a case, we can transform the feature with a log function to get a better look.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    6.00   22.00   38.00   46.47   62.00  289.00
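
As a rough check of why a log transform helps here, the sketch below compares how lopsided the distribution is before and after `log10()`, using only the minimum, median and maximum from the summary above. The `tail_ratio` helper is a hypothetical name introduced for illustration; it is not part of the original analysis.

```r
# Minimum, median and maximum of total sulfur dioxide, copied from the summary.
tsd <- c(min = 6, median = 38, max = 289)

# Ratio of the upper span (median to max) to the lower span (min to median).
# A value near 1 means the two sides are balanced;
# a large value means a long right tail.
tail_ratio <- function(s) {
  unname((s["max"] - s["median"]) / (s["median"] - s["min"]))
}

ratio_raw <- tail_ratio(tsd)         # on the original scale
ratio_log <- tail_ratio(log10(tsd))  # after the log10 transform
print(c(raw = ratio_raw, log10 = ratio_log))
```

Because `log10()` is monotonic, it maps the minimum, median and maximum of the raw values to those of the transformed values, so the three summary numbers are enough for this comparison.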

Residual sugar (g / dm^3)

  • Residual sugar : the amount of sugar remaining after fermentation stops. It’s rare to find wines with less than 1 gram/liter, and wines with more than 45 grams/liter are considered sweet.
    • After a log transformation, the distribution is close to normal but still has a long right tail.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.900   1.900   2.200   2.539   2.600  15.500

Fixed acidity (tartaric acid - g / dm^3) & Citric acid (g / dm^3)

  • Fixed acidity : most acids involved with wine are fixed or nonvolatile (do not evaporate readily).
    • The distribution is almost normal.
    • It peaks at about 7, with a mean of 8.32 and a maximum of 15.90.
  • Citric acid: found in small quantities, citric acid can add ‘freshness’ and flavor to wines.
    • The distribution is positively skewed, with a mean of 0.27 and a maximum of 1.

## [1] "Citric acid"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.090   0.260   0.271   0.420   1.000
## [1] "Fixed acidity"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    4.60    7.10    7.90    8.32    9.20   15.90

Summary of Grade A wine :

##        X          fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :   8.0   Min.   : 4.900   Min.   :0.1200   Min.   :0.0000  
##  1st Qu.: 482.0   1st Qu.: 7.400   1st Qu.:0.3000   1st Qu.:0.3000  
##  Median : 939.0   Median : 8.700   Median :0.3700   Median :0.4000  
##  Mean   : 831.7   Mean   : 8.847   Mean   :0.4055   Mean   :0.3765  
##  3rd Qu.:1089.0   3rd Qu.:10.100   3rd Qu.:0.4900   3rd Qu.:0.4900  
##  Max.   :1585.0   Max.   :15.600   Max.   :0.9150   Max.   :0.7600  
##  residual.sugar    chlorides       free.sulfur.dioxide
##  Min.   :1.200   Min.   :0.01200   Min.   : 3.00      
##  1st Qu.:2.000   1st Qu.:0.06200   1st Qu.: 6.00      
##  Median :2.300   Median :0.07300   Median :11.00      
##  Mean   :2.709   Mean   :0.07591   Mean   :13.98      
##  3rd Qu.:2.700   3rd Qu.:0.08500   3rd Qu.:18.00      
##  Max.   :8.900   Max.   :0.35800   Max.   :54.00      
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  7.00       Min.   :0.9906   Min.   :2.880   Min.   :0.3900  
##  1st Qu.: 17.00       1st Qu.:0.9947   1st Qu.:3.200   1st Qu.:0.6500  
##  Median : 27.00       Median :0.9957   Median :3.270   Median :0.7400  
##  Mean   : 34.89       Mean   :0.9960   Mean   :3.289   Mean   :0.7435  
##  3rd Qu.: 43.00       3rd Qu.:0.9973   3rd Qu.:3.380   3rd Qu.:0.8200  
##  Max.   :289.00       Max.   :1.0032   Max.   :3.780   Max.   :1.3600  
##     alcohol      quality  wine.grade       
##  Min.   : 9.20   3:  0   Length:217        
##  1st Qu.:10.80   4:  0   Class :character  
##  Median :11.60   5:  0   Mode  :character  
##  Mean   :11.52   6:  0                     
##  3rd Qu.:12.20   7:199                     
##  Max.   :14.00   8: 18

Summary of Grade B wine :

##        X          fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :   1.0   Min.   : 4.700   Min.   :0.1600   Min.   :0.0000  
##  1st Qu.: 382.5   1st Qu.: 7.100   1st Qu.:0.4100   1st Qu.:0.0900  
##  Median : 768.0   Median : 7.800   Median :0.5400   Median :0.2400  
##  Mean   : 793.0   Mean   : 8.254   Mean   :0.5386   Mean   :0.2583  
##  3rd Qu.:1219.5   3rd Qu.: 9.100   3rd Qu.:0.6400   3rd Qu.:0.4000  
##  Max.   :1599.0   Max.   :15.900   Max.   :1.3300   Max.   :0.7900  
##  residual.sugar     chlorides       free.sulfur.dioxide
##  Min.   : 0.900   Min.   :0.03400   Min.   : 1.00      
##  1st Qu.: 1.900   1st Qu.:0.07100   1st Qu.: 8.00      
##  Median : 2.200   Median :0.08000   Median :14.00      
##  Mean   : 2.504   Mean   :0.08897   Mean   :16.37      
##  3rd Qu.: 2.600   3rd Qu.:0.09100   3rd Qu.:22.00      
##  Max.   :15.500   Max.   :0.61100   Max.   :72.00      
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  6.00       Min.   :0.9901   Min.   :2.860   Min.   :0.3700  
##  1st Qu.: 24.00       1st Qu.:0.9958   1st Qu.:3.210   1st Qu.:0.5400  
##  Median : 40.00       Median :0.9968   Median :3.310   Median :0.6100  
##  Mean   : 48.95       Mean   :0.9969   Mean   :3.311   Mean   :0.6473  
##  3rd Qu.: 65.00       3rd Qu.:0.9979   3rd Qu.:3.400   3rd Qu.:0.7000  
##  Max.   :165.00       Max.   :1.0037   Max.   :4.010   Max.   :1.9800  
##     alcohol      quality  wine.grade       
##  Min.   : 8.40   3:  0   Length:1319       
##  1st Qu.: 9.50   4:  0   Class :character  
##  Median :10.00   5:681   Mode  :character  
##  Mean   :10.25   6:638                     
##  3rd Qu.:10.90   7:  0                     
##  Max.   :14.90   8:  0

Summary of Grade C wine :

##        X          fixed.acidity    volatile.acidity  citric.acid    
##  Min.   :  19.0   Min.   : 4.600   Min.   :0.2300   Min.   :0.0000  
##  1st Qu.: 435.0   1st Qu.: 6.800   1st Qu.:0.5650   1st Qu.:0.0200  
##  Median : 834.0   Median : 7.500   Median :0.6800   Median :0.0800  
##  Mean   : 837.7   Mean   : 7.871   Mean   :0.7242   Mean   :0.1737  
##  3rd Qu.:1285.5   3rd Qu.: 8.400   3rd Qu.:0.8825   3rd Qu.:0.2700  
##  Max.   :1522.0   Max.   :12.500   Max.   :1.5800   Max.   :1.0000  
##  residual.sugar     chlorides       free.sulfur.dioxide
##  Min.   : 1.200   Min.   :0.04500   Min.   : 3.00      
##  1st Qu.: 1.900   1st Qu.:0.06850   1st Qu.: 5.00      
##  Median : 2.100   Median :0.08000   Median : 9.00      
##  Mean   : 2.685   Mean   :0.09573   Mean   :12.06      
##  3rd Qu.: 2.950   3rd Qu.:0.09450   3rd Qu.:15.50      
##  Max.   :12.900   Max.   :0.61000   Max.   :41.00      
##  total.sulfur.dioxide    density             pH          sulphates     
##  Min.   :  7.00       Min.   :0.9934   Min.   :2.740   Min.   :0.3300  
##  1st Qu.: 13.50       1st Qu.:0.9957   1st Qu.:3.300   1st Qu.:0.4950  
##  Median : 26.00       Median :0.9966   Median :3.380   Median :0.5600  
##  Mean   : 34.44       Mean   :0.9967   Mean   :3.384   Mean   :0.5922  
##  3rd Qu.: 48.00       3rd Qu.:0.9977   3rd Qu.:3.500   3rd Qu.:0.6000  
##  Max.   :119.00       Max.   :1.0010   Max.   :3.900   Max.   :2.0000  
##     alcohol      quality  wine.grade       
##  Min.   : 8.40   3:10    Length:63         
##  1st Qu.: 9.60   4:53    Class :character  
##  Median :10.00   5: 0    Mode  :character  
##  Mean   :10.22   6: 0                      
##  3rd Qu.:11.00   7: 0                      
##  Max.   :13.10   8: 0


Univariate Analysis

What is the structure of your dataset?

The dataset contains 1599 observations of 13 variables. A categorical variable (wine.grade) has been added.

What is/are the main feature(s) of interest in your dataset?

  • Studying the variation of pH in this wine sample.
  • The relationship between the percentage of alcohol and the resulting quality of wine.

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

I suspect that sulphates and the pH index have a deep impact on quality.

Did you create any new variables from existing variables in the dataset?

Yes, I have created the wine.grade variable.

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

I performed some transformations and took some quantiles to make the graphs easier to read, but overall the data is tidy.



Bivariate Plots Section

Examining Correlation between sample attributes :

  • There’s a positive correlation between :
    • Quality and Sulphates.
    • Quality and alcohol.
    • Quality and citric acid.
  • There’s a negative correlation between Quality and Volatile Acidity.

  • There’s a negative correlation between pH and both citric acid and fixed acidity.


Quality & Alcohol :

  • In the boxplots below, higher-quality wines tend to contain more alcohol. This is not the case for wines with quality equal to 5, however.

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$alcohol and redwine$quality
## t = 21.639, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4373540 0.5132081
## sample estimates:
##       cor 
## 0.4761663
  • Question : If we keep increasing the alcohol percentage, does this always result in higher quality?
    • Answer : Not according to this sample. In the smoothed graph below (smoothed to remove noise), quality rises with the alcohol percentage up to about 13% and then appears to decline. The evidence above 13% is weak, though: that subset contains only 23 wines, and its correlation with quality (-0.09, p-value 0.69) is not statistically significant.

## 
##  Pearson's product-moment correlation
## 
## data:  alcohol_above_13$quality and alcohol_above_13$alcohol
## t = -0.39861, df = 21, p-value = 0.6942
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4816540  0.3376049
## sample estimates:
##         cor 
## -0.08665653
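
The figures in these `cor.test()` outputs can be reproduced from the correlation and the degrees of freedom alone, which is a handy way to read them. The sketch below is an illustration under that assumption; `pearson_check` is a hypothetical helper name, not a function used in the original report.

```r
# Recompute the t statistic, p-value and 95% confidence interval that
# cor.test() reports for a Pearson correlation r with df = n - 2.
pearson_check <- function(r, df) {
  n <- df + 2
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)
  p_value <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  # Fisher z-transformation for the confidence interval.
  z <- atanh(r)
  half_width <- qnorm(0.975) / sqrt(n - 3)
  ci <- tanh(c(z - half_width, z + half_width))
  list(n = n, t = t_stat, p = p_value, ci = ci)
}

# Inputs copied from the two outputs above.
all_wines <- pearson_check(0.4761663, 1597)   # full sample
above_13  <- pearson_check(-0.08665653, 21)   # alcohol above 13%
print(above_13)
```

With 21 degrees of freedom, the subset above 13% alcohol contains only 23 wines, which is why its confidence interval is wide and spans zero.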

Quality & Sulphates :

  • As expected, based on the sample data, wine quality increases as the sulphates content increases, although the correlation is weak (about 0.25).

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$quality and redwine$sulphates
## t = 10.38, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2049011 0.2967610
## sample estimates:
##       cor 
## 0.2513971

Quality & Citric Acid :

  • As expected, based on the sample data, wine quality increases as the citric acid content increases, although the correlation is weak (about 0.23).

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$quality and redwine$citric.acid
## t = 9.2875, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1793415 0.2723711
## sample estimates:
##       cor 
## 0.2263725

pH with Citric Acid and Fixed acidity :

  • Of course it’s expected to have a negative correlation between acidity and pH index since it’s an index to measure acidity (More acidic solutions have lower pH).
    • pH & Citric Acid

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$pH and redwine$citric.acid
## t = -25.767, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.5756337 -0.5063336
## sample estimates:
##        cor 
## -0.5419041
    • pH & Fixed Acidity

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$pH and redwine$fixed.acidity
## t = -37.366, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.7082857 -0.6559174
## sample estimates:
##        cor 
## -0.6829782


Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation.

  • pH with fixed acidity and citric acid : as expected, pH is negatively correlated with both.
  • Alcohol & Quality :
    • Under 13% alcohol : quality increases with alcohol (positive correlation).
    • Above 13% alcohol : quality appears to decline, but the negative correlation is weak (-0.09) and not statistically significant.
  • Based on the sample data, wine quality increases as the sulphates and citric acid contents increase.

Did you observe any interesting relationships between the other features
(not the main feature(s) of interest)?

Yes, there are some relationships that aren’t part of this analysis, such as the one between free sulfur dioxide and total sulfur dioxide.

What was the strongest relationship you found?

The strongest relationship is between fixed acidity and pH, with a correlation of -0.68 (a fairly strong negative correlation). Again, this was expected.



Multivariate Plots Section

Alcohol with other variables (pH & sulphates) :

  • As shown in the graph below, higher sulphates content and higher alcohol content (below 13%) go with better wine quality.

  • Based on the graph below, low pH combined with high alcohol concentration (below 13%) also seems to go with better quality.

Acids : Fixed Acidity with Citric Acidity :

  • From the graph below, there’s a positive correlation between fixed acidity and citric acid content. However, the graph shows no clear pattern in terms of quality.

## 
##  Pearson's product-moment correlation
## 
## data:  redwine$citric.acid and redwine$fixed.acidity
## t = 36.234, df = 1597, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6438839 0.6977493
## sample estimates:
##       cor 
## 0.6717034

Linear Model using critical variables :

## 
## Calls:
## m1: lm(formula = as.numeric(quality) ~ alcohol, data = training_data)
## m2: lm(formula = as.numeric(quality) ~ alcohol + sulphates, data = training_data)
## m3: lm(formula = as.numeric(quality) ~ alcohol + sulphates + volatile.acidity, 
##     data = training_data)
## m4: lm(formula = as.numeric(quality) ~ alcohol + sulphates + volatile.acidity + 
##     citric.acid, data = training_data)
## m5: lm(formula = as.numeric(quality) ~ alcohol + sulphates + volatile.acidity + 
##     citric.acid + fixed.acidity, data = training_data)
## m6: lm(formula = as.numeric(quality) ~ alcohol + sulphates + pH, 
##     data = training_data)
## 
## =====================================================================================================
##                          m1            m2           m3           m4           m5            m6       
## -----------------------------------------------------------------------------------------------------
##   (Intercept)           0.070        -0.462*       0.785**      0.761**      0.272         2.101***  
##                        (0.225)       (0.228)      (0.252)      (0.260)      (0.290)       (0.515)    
##   alcohol               0.343***      0.322***     0.288***     0.288***     0.302***      0.349***  
##                        (0.021)       (0.021)      (0.020)      (0.020)      (0.020)       (0.021)    
##   sulphates                           1.149***     0.776***     0.767***     0.759***      0.979***  
##                                      (0.144)      (0.142)      (0.144)      (0.143)       (0.145)    
##   volatile.acidity                                -1.242***    -1.212***    -1.314***                
##                                                   (0.128)      (0.148)      (0.149)                  
##   citric.acid                                                   0.054       -0.402*                  
##                                                                (0.135)      (0.182)                  
##   fixed.acidity                                                              0.063***                
##                                                                             (0.017)                  
##   pH                                                                                      -0.827***  
##                                                                                           (0.150)    
## -----------------------------------------------------------------------------------------------------
##   R-squared             0.211         0.261        0.327        0.328        0.337         0.284     
##   adj. R-squared        0.211         0.259        0.325        0.325        0.334         0.281     
##   sigma                 0.722         0.700        0.668        0.668        0.663         0.689     
##   F                   256.479       168.592      154.970      116.165       96.922       126.040     
##   p                     0.000         0.000        0.000        0.000        0.000         0.000     
##   Log-likelihood    -1047.617     -1016.611     -971.283     -971.203     -964.337     -1001.526     
##   Deviance            499.109       467.857      425.655      425.585      419.534       453.367     
##   AIC                2101.233      2041.222     1952.565     1954.407     1942.675      2013.052     
##   BIC                2115.831      2060.686     1976.895     1983.602     1976.736      2037.381     
##   N                   959           959          959          959          959           959         
## =====================================================================================================
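
One detail worth checking when reading this table: the formulas call `as.numeric(quality)`, and the `str()` output earlier shows `quality` as an ordered factor. In R, `as.numeric()` on a factor returns the internal level codes (1 to 6 here), not the labels (3 to 8). The sketch below illustrates the difference on a made-up vector; whether `training_data$quality` was still a factor when the models were fitted is an assumption based on the formulas, not something the output confirms.

```r
# Hypothetical quality scores stored as an ordered factor with levels 3..8.
quality <- factor(c(5, 5, 6, 7, 3, 8, 4), levels = 3:8, ordered = TRUE)

# as.numeric() on a factor gives the level codes, so quality 5 becomes 3.
codes <- as.numeric(quality)

# Converting through character recovers the original scores.
scores <- as.numeric(as.character(quality))

print(rbind(codes = codes, scores = scores))
```

If the codes were used, every response value is shifted down by 2. The slopes are unaffected by such a shift; only the intercepts would be 2 lower than in a fit on the original 3 to 8 scale.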



Multivariate Analysis

Talk about some of the relationships you observed in this part of the
investigation.

High alcohol percentages (below 13%) and high sulphate contents combined result in better wines.

Were there any interesting or surprising interactions between features? Did you create any models with your dataset? Discuss the strengths and limitations of your model.

  • The low R-squared values (at most 0.337, for model m5) suggest that information needed to predict quality accurately is missing from these variables.
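
To make the fit statistics in the model table easier to interpret, the sketch below recomputes two of them from the reported values. It assumes the standard definitions of adjusted R-squared and AIC for `lm()` fits; `adj_r2` is a hypothetical helper name, and the inputs are the rounded figures from the table, so the results agree only up to rounding.

```r
n <- 959  # observations in training_data, from the table

# Adjusted R-squared penalises R-squared for the number of predictors p.
adj_r2 <- function(r2, p, n) 1 - (1 - r2) * (n - 1) / (n - p - 1)

adj_m5 <- adj_r2(0.337, 5, n)  # m5 has 5 predictors
adj_m2 <- adj_r2(0.261, 2, n)  # m2 has 2 predictors

# AIC = -2 * logLik + 2 * k, where k counts the coefficients
# plus the residual variance.
aic_m1 <- -2 * (-1047.617) + 2 * 3  # intercept, alcohol slope, sigma

print(round(c(adj_m5 = adj_m5, adj_m2 = adj_m2), 3))
print(aic_m1)
```

Read this way, the table also shows that adding citric.acid (m3 to m4) raises the AIC from 1952.565 to 1954.407, so that variable adds little once volatile.acidity is in the model.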



Final Plots and Summary

Plot One


## 
##  Pearson's product-moment correlation
## 
## data:  alcohol_above_13$quality and alcohol_above_13$alcohol
## t = -0.39861, df = 21, p-value = 0.6942
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4816540  0.3376049
## sample estimates:
##         cor 
## -0.08665653

Description One :

Based on the sample data, wine quality rises with the alcohol percentage up to about 13% and then appears to decline. The decline is only suggestive: the correlation above 13% is weak (-0.09) and not statistically significant (p-value 0.69).

Plot Two

Description Two

As shown in the graph above, higher sulphates content and higher alcohol content (below 13%) go with better wine quality.

Plot Three

Description Three

About 34% of the variance in quality is explained by the best linear model (m5, with an R-squared of 0.337).


Reflection :